A CPU-GPU hybrid approach for the unsymmetric multifrontal method
Abstract
The multifrontal method is an efficient direct method for solving large-scale sparse unsymmetric linear systems. It transforms the factorization of a large sparse matrix into a sequence of factorizations of smaller dense frontal matrices. Some of these dense operations can be accelerated by using a graphics processing unit (GPU). We analyze the unsymmetric multifrontal method from both an algorithmic and an implementation perspective to see how a GPU, in particular the NVIDIA Tesla C2070, can be used to accelerate the computations. Our main acceleration strategies include (i) performing BLAS operations on both the CPU and the GPU, (ii) improving the communication efficiency between the CPU and the GPU by using page-locked memory, zero-copy memory, and asynchronous memory copy, and (iii) a modified algorithm that reuses memory between different GPU tasks and sets thresholds to determine whether certain tasks should be performed on the GPU. The proposed acceleration strategies are implemented by modifying UMFPACK, an unsymmetric multifrontal linear system solver. Numerical results show that the CPU-GPU hybrid approach can accelerate the unsymmetric multifrontal solver, especially for computationally expensive problems. © 2011 Elsevier B.V. All rights reserved.
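Strategy (iii) can be illustrated with a minimal sketch. It is not the authors' implementation and does not use UMFPACK or CUDA: it only models the decision logic in plain Python. The names `FLOP_THRESHOLD`, `choose_device`, `ReusableBuffer`, and `plan`, the threshold value, and the flop-count criterion are all assumptions made for illustration; the abstract does not state which quantity the thresholds are applied to or what their values are.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the abstract does not give threshold values.
FLOP_THRESHOLD = 2_000_000


def update_flops(m: int, n: int, k: int) -> int:
    """Flop count of the dense update C -= L @ U with L (m x k) and U (k x n)."""
    return 2 * m * n * k


def choose_device(m: int, n: int, k: int, threshold: int = FLOP_THRESHOLD) -> str:
    """Pick the GPU only when the update is large enough to amortize transfers."""
    return "gpu" if update_flops(m, n, k) >= threshold else "cpu"


@dataclass
class ReusableBuffer:
    """Models one device buffer that stays allocated between GPU tasks."""
    capacity: int = 0      # bytes currently allocated
    allocations: int = 0   # number of (re)allocations performed so far

    def reserve(self, nbytes: int) -> None:
        # Grow only when the request exceeds the current capacity;
        # otherwise the existing allocation is reused unchanged.
        if nbytes > self.capacity:
            self.capacity = nbytes
            self.allocations += 1


def plan(fronts, threshold: int = FLOP_THRESHOLD, itemsize: int = 8):
    """Assign each (m, n, k) update to a device and size the shared GPU buffer."""
    buf = ReusableBuffer()
    devices = []
    for m, n, k in fronts:
        device = choose_device(m, n, k, threshold)
        if device == "gpu":
            # Room for L, U, and C in double precision (itemsize bytes each).
            buf.reserve(itemsize * (m * k + k * n + m * n))
        devices.append(device)
    return devices, buf
```

Calling `plan([(10, 10, 5), (500, 500, 100), (400, 400, 100), (600, 600, 100)])` keeps the small update on the CPU and sends the three larger ones to the GPU, while the buffer is allocated only twice because the second GPU task fits in the space reserved for the first.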
Related resources
Unsymmetric-pattern Multifrontal Methods for Parallel Sparse LU Factorization
Sparse matrix factorization algorithms are typically characterized by irregular memory access patterns that limit their performance on parallel-vector supercomputers. For symmetric problems, methods such as the multifrontal method replace irregular operations with dense matrix kernels. However, no efficient method based primarily on dense matrix kernels exists for matrices whose pattern is very u...
An Approach for Parallelizing any General Unsymmetric Sparse Matrix Algorithm
In many large-scale scientific and engineering computations, the solution to a sparse linear system is required. We present a partial unsymmetric nested dissection method that can be used to parallelize any general unsymmetric sparse matrix algorithm whose pivot search can be restricted to a subset of rows and columns in the active submatrix. The submatrix is determined by the partial unsymmetri...
An Unsymmetric-Pattern Multifrontal Method for Sparse LU Factorization
Sparse matrix factorization algorithms for general problems are typically characterized by irregular memory access patterns that limit their performance on parallel-vector supercomputers. For symmetric problems, methods such as the multifrontal method avoid indirect addressing in the innermost loops by using dense matrix kernels. However, no efficient LU factorization algorithm based primarily ...
A Distributed CPU-GPU Sparse Direct Solver
This paper presents the first hybrid MPI+OpenMP+CUDA implementation of a distributed memory right-looking unsymmetric sparse direct solver (i.e., sparse LU factorization) that uses static pivoting. While BLAS calls can account for more than 40% of the overall factorization time, the difficulty is that small problem sizes dominate the workload, making efficient GPU utilization challenging. This ...
NUMERICAL ANALYSIS GROUP PROGRESS REPORT January 1994 – December 1995
2 Sparse Matrices: 2.1 The direct solution of sparse unsymmetric linear sets of equations (I.S. Duff and J.K. Reid); 2.2 The design and use of algorithms for permuting large entries to the diagonal; 2.6 Element resequencing for use with a multiple front solver (J. A. Scott); 2.7 Exploiting zeros on the diagonal in the direct s...
Journal:
- Parallel Computing
Volume 37
Publication date: 2011